We obtain exact expressions for the asymptotic behaviour of the average probability of the block 
decoding error for ensembles of regular low density parity check error correcting codes, by employing 
diagrammatic techniques. Furthermore, we show how imposing simple constraints on the code 
ensemble (that can be practically implemented in linear time), allows one to suppress the error 
probability for codes with more than 2 checks per bit, to an arbitrarily low power of N. As such we 
provide a practical route to a (sub-optimal) expurgated ensemble. 

PACS numbers: 89.70.+c, 75.10.Hk, 05.50.+q, 05.70.Fh 

I. INTRODUCTION 

Recent research in a cross-disciplinary field between information theory (IT) and statistical mechanics (SM) 
revealed a great similarity between low density parity check (LDPC) error correcting codes and systems of Ising 
spins (microscopic magnets) which interact with each other over random graphs [1, 2]. On the basis of this similarity, 
notions and methods developed in SM were employed to analyse LDPC codes, which successfully clarified typical 
properties of these excellent codes when the code length N is sufficiently large. 

In general, an LDPC code is defined by a parity check matrix A which represents dependences between codeword 
bits and parity checks determined under certain constraints. This implies that the performance of LDPC codes, in 
particular the probability of the block decoding error $P_B(A)$, fluctuates depending on each realization of A. Therefore, 
the average of the decoding error probability over a given ensemble, $\overline{P_B}$, is frequently used for characterising the 
performance of LDPC code ensembles. 

Detailed analysis in the IT literature showed that $\overline{P_B}$ of naively constructed LDPC code ensembles is generally composed 
of two terms: the first term, which depends exponentially on N, represents the average performance of typical codes, 
whereas the second term scales polynomially with N due to a polynomially small fraction of poor codes 
in the ensemble. This means that even if the noise level of the communication channel is sufficiently low such 
that typical codes exhibit exponentially small decoding error probabilities (which is implicitly assumed throughout 
this paper), communication performance can still be very poor, exhibiting an $O(1)$ decoding error probability with a 
polynomially small probability when codes are naively generated from the ensembles. As the typical behaviour has 
mainly been examined so far, the polynomial contribution from the atypical codes has not been sufficiently discussed 
in the SM approach. Although this slow decay in the error probability would not be observed for most codes in the 
ensemble, examining the causes of the low error correction ability of the atypical poor codes is doubtlessly important both 
theoretically, and practically for constructing more reliable ensembles. 

The purpose of this paper is to answer this demand from the side of SM. More specifically, we develop a scheme 
to directly assess the most dominant contribution of the poor codes in $\overline{P_B}$ on the basis of specific kinds of graph 
configuration, utilising diagrammatic techniques. This significantly simplifies the evaluation procedure of $\overline{P_B}$ compared 
to the existing method [6], and can be employed for a wider class of expurgated ensembles. Moreover, it provides 
insights that lead to a practical expurgation method that is also presented in this paper. Finally, the validity of the 
evaluation scheme and the efficacy of the proposed expurgation technique are computationally confirmed. 

The paper is organised as follows: 

-In the next section II we briefly review the general scenario of LDPC codes and introduce basic notions which are 
necessary for evaluating the error probability in the following sections. 

-In section III we introduce the various code ensembles we will work with, and different representations for a code 
construction. 

-In section IV we link the probability of having a code with low minimal distance to the polynomial error probability, 
and we calculate the polynomial error probability by diagrammatic techniques for various code ensembles. As we can 
explicitly link it to the occurrence of short loops, the distribution of occurrence of such loops is also determined. 

-In section V we present a practical linear time (on average) algorithm to remove short loops from a code construction, 
thus reducing the polynomial error probability to an arbitrarily low value. This is backed up by numerical simulations. 

-Finally, section VI is devoted to a summary. 

-Technical details about the diagrammatic technique can be found in appendix A. 

-Details about the link between the minimal distance and the polynomial error probability for various decoding 
schemes are presented in appendix B. 

II. LDPC CODES, DECODING ERROR AND WEIGHT OF CODEWORDS 

We here concentrate on regular (c, d, N) LDPC code ensembles which involve N message bits and L = cN/d parity 
checks. Given a specific code, each message bit is involved in c parity checks, and each parity check involves d message 
bits. In practice, this dependence is represented by a parity check matrix A. An encoding scheme consists in the 
generation of a codeword $t \in \{0,1\}^N$ from an information vector $s \in \{0,1\}^K$ (with $K = N - L$) via the linear 
operation $t = G^T s$ (mod 2), where G is the generator matrix that satisfies the condition $A G^T = 0$ (mod 2). The code 
rate is then given by R = K/N, and measures the information redundancy of the transmitted vector. 

Upon transmission of the codeword t via a noisy channel, taken here to be a binary symmetric channel (BSC), the 
vector $r = t + n^0$ (mod 2) is received, where $n^0 \in \{0,1\}^N$ is the true channel noise. The statistics of the BSC are 
fully determined by the flip rate $p \in [0, 1]$: 

$$P(n_i^0) = (1-p)\,\delta_{n_i^0,0} + p\,\delta_{n_i^0,1} \qquad (1)$$

Decoding is carried out by multiplying r by A to produce the syndrome vector $z = A r = A n^0$ (since $A G^T = 0$). In 
order to reconstruct the original message s, one has to obtain an estimate $\hat{n}$ for the true noise $n^0$. One major strategy 
for this is maximum likelihood (ML) decoding, which is the main focus of this paper. It consists in the selection of 
that vector $\hat{n}^{ML}$ that minimises the number of non-zero elements (weight) $w(n) = \sum_{i=1}^{N} n_i$ while satisfying the parity check 
equation $z = A n$. Decoding failure occurs when $\hat{n}^{ML}$ differs from $n^0$. The probability of this occurring is termed 
the (block) decoding error probability $P_B(A)$, which serves as a performance measure of the code specified by A 
given the (ML) decoding strategy. 

It is worthwhile to mention that for any true noise vector $n^0$, $\hat{n} = n^0 + x$ (mod 2), where x is an arbitrary codeword 
vector for which $Ax = 0$ (mod 2) holds, also satisfies the parity check equation $A\hat{n} = A n^0 = z$ (mod 2). The set of 
indices of non-zero elements of x is denoted by $\mathcal{I}(x)$. We denote the probability that a given decoding strategy (DS) 
will select $\hat{n} = n^0 + x$ with $w(x) = w$ rather than $n^0$ as $P_{DS}(e|w)$. The ML decoding strategy fails in correctly 
estimating those noise vectors $n^0$ for which fewer than half of the $n^0_{i \in \mathcal{I}(x)}$ are zero, since the weight of $n^0 + x$ (mod 2) then becomes 
smaller than $w(n^0)$. To noise vectors $n^0$ for which exactly half of the $n^0_{i \in \mathcal{I}(x)}$ are zero, we attribute an error 1/2, since 
the weight of $n^0 + x$ (mod 2) is then equal to $w(n^0)$, such that 

$$P_{ML}(e|w) = \sum_{i=0}^{\mathrm{int}((w-1)/2)} \binom{w}{i}\, p^{w-i} (1-p)^{i} + \frac{\delta_{w,\mathrm{even}}}{2} \binom{w}{w/2} \left(p(1-p)\right)^{w/2} \simeq O\left(p^{\mathrm{int}((w+1)/2)}\right) \qquad (2)$$

where int(x) is the integer part of x (for $P(e|w)$ for other decoding schemes we refer to appendix B). The minimum 
of $w(x)$ under the constraints $Ax = 0$ (mod 2) and $x \neq 0$ is commonly known as the minimal distance of A and is 
here denoted as $W(A)$. It provides a lower bound for the decoding error probability of the ML scheme as 
$P_B(A) > O\left(p^{\mathrm{int}((W(A)+1)/2)}\right)$. 
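
As a numerical illustration of the error probability $P_{ML}(e|w)$ described above, the following Python sketch evaluates it directly from the verbal definition: an error occurs when fewer than half of the noise bits on the support of x are zero, and a tie (exactly half) counts with weight 1/2. The function name `p_ml_error` is hypothetical and not taken from the paper; this is a sketch under that reading of Eq. (2), not a definitive implementation.

```python
from math import comb


def p_ml_error(w: int, p: float) -> float:
    """Probability that ML decoding prefers n0 + x over n0 on a BSC with
    flip rate p, for a codeword x of weight w (hypothetical helper)."""
    # i counts the zero components of n0 on the support of x;
    # decoding fails outright when i < w/2, i.e. i <= int((w-1)/2).
    total = sum(comb(w, i) * p ** (w - i) * (1 - p) ** i
                for i in range((w - 1) // 2 + 1))
    # For even w, a tie (i = w/2) is counted as half an error.
    if w % 2 == 0:
        total += 0.5 * comb(w, w // 2) * (p * (1 - p)) ** (w // 2)
    return total


if __name__ == "__main__":
    for w in (1, 2, 3, 4):
        print(w, p_ml_error(w, 0.01))
```

For small p the value is dominated by the lowest power of p that appears in the sum, which is what makes low-weight codewords so costly.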

Gallager [7] showed that for $c \ge 3$ the minimal distance grows as $O(N)$ for most codes characterised by the 
(c, d)-constraints, which implies that the decoding error probability can decay exponentially fast with respect to N 
when p is sufficiently low. However, he also showed that the minimal distance and, more generally, the weights 
of certain codeword vectors become $O(1)$ for a polynomially small fraction of codes when the code ensemble is 
naively constructed, which implies that the average decoding error probability over the ensemble $\overline{P_B}$ exhibits a slow 
polynomial decay with respect to N, being dominated by the atypical poor codes. As Gallager did not examine the 
detailed properties of the poor codes, it was only recently that upper and lower bounds of $\overline{P_B}$ were evaluated for 
several types of naively constructed LDPC code ensembles [6]. However, the obtained bounds are still not tight in 
the prefactors. In addition, extending the analysis to other ensembles does not seem straightforward as the provided 
technique is rather complicated. The first purpose of this paper is to show that one can directly evaluate the leading 
contribution of $\overline{P_B}$ by making a one-to-one connection between the occurrence of low weights in codeword vectors and 
the presence of some dangerous finite diagram(s) (sub-graph(s)) in a graph representation of that code. 

In order to suppress the influence of the atypical poor codes, Gallager proposed to work in an expurgated ensemble, 
where the codes with low minimal distance are somehow removed. However, a practical way to obtain the expurgated 
ensemble has not been provided so far. The second purpose of the current paper is to provide a (typically) linear-time 
practical algorithm for the expurgation and we numerically demonstrate its efficacy. 
III. CODE ENSEMBLES AND REPRESENTATIONS OF CODE CONSTRUCTIONS 

A. Code ensembles 

As mentioned in the previous section, evaluating the distribution of low weights of the poor codes in a given 
ensemble becomes relevant for the current purposes. This distribution highly depends on the details of the definition 
of code ensembles. We here work on the following three ensembles defined on the basis of bipartite graphs (Fig. 1): 


FIG. 1: An example of a regular bipartite graph with (c,d) = (4,5). On the left the vertices (message bits) (full circles), and 
on the right the edges (parity checks) (empty circles) 

• Miller-Burshtein (MB) ensemble: Put N vertices (message bits) and L edges (parity checks) on the left 
and right, respectively. For each vertex and edge, we provide c and d arcs, respectively. In order to associate 
these, the arcs originating from the left are labelled from 1 to Nc (= Ld), and similarly for the right. A 
permutation $\pi$ is then uniformly drawn from the space of all permutations of {1, 2, ..., Nc}. Finally, we link 
the arc labelled i on the left with the arc labelled $\pi(i)$ on the right. This defines a code, completely determining 
a specific dependence between message bits and parity checks. The uniform generation of $\pi$ characterises the 
ensemble. Note that in this way multiple links between a vertex/edge pair are allowed. 

• No multiple links (NML) ensemble: Multiple links in the bipartite graph may reduce the effective 
number of parity checks of the associated message bits, which may weaken the error correction ability. 
The second ensemble is provided by expurgating graphs containing multiple links from the MB ensemble. 

• Minimum loop length $\ell$ (MLL-$\ell$) ensemble: In the bipartite graph, length $\ell$ loops are defined as irreducible 
closed paths composed of $\ell$ different vertices and $\ell$ different edges [26]. As shown later, short loops become a 
cause of poor error correction ability, as they allow for a shorter minimal distance. Therefore, we construct code 
ensembles by completely expurgating graphs containing loops of length shorter than $\ell$ from the MB ensemble, 
and examine how well such expurgation contributes to the improvement of the average error correction capability. 
Note that the MB and NML ensembles correspond to the MLL-1 and MLL-2 ensembles, respectively. 
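
The MB construction described above maps onto a few lines of code. The following Python sketch is illustrative only: `sample_mb_links` and `has_multiple_links` are hypothetical helper names, vertices and edges are indexed from 0, and it is assumed that left arc i belongs to vertex `i // c` while right arc j belongs to edge `j // d`.

```python
import random
from collections import Counter


def sample_mb_links(N, c, d, rng):
    """Draw one (c, d, N) code from the MB ensemble.

    Returns (links, L), where links is a list of (vertex, edge) pairs,
    one per arc, and L = cN/d is the number of parity checks.
    """
    if (N * c) % d != 0:
        raise ValueError("cN must be divisible by d")
    L = N * c // d
    perm = list(range(N * c))
    rng.shuffle(perm)  # uniform permutation pi of the arc labels
    # Left arc i is joined to right arc perm[i].
    links = [(i // c, perm[i] // d) for i in range(N * c)]
    return links, L


def has_multiple_links(links):
    """True if some vertex/edge pair is linked more than once, i.e. the
    code would be expurgated when passing to the NML ensemble."""
    return any(m > 1 for m in Counter(links).values())


if __name__ == "__main__":
    links, L = sample_mb_links(N=20, c=3, d=4, rng=random.Random(0))
    print(L, has_multiple_links(links))
```

Rejecting every sample for which `has_multiple_links` is true would give a (slow but simple) sampler for the NML ensemble.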



B. Representations of code constructions 

Although the ensembles above are constructed on the basis of bipartite graphs, for convenience we will also use 
two other representations: 

• Monopartite (hyper)-graph representation: Each message bit is represented by a vertex. The vertices are 
connected by hyper-edges (each linking d vertices), and each vertex is involved c times in a hyper-edge (see Fig. 2). 

• Matrix representation: L rows, N columns, where $a_{ev}$ is the number of times vertex v appears in edge e. 
For regular (c,d,N) codes, the following constraints on $\{a_{ev}\}$ apply: 

$$\sum_{v=1}^{N} a_{ev} = d \quad \forall e = 1, .., L, \qquad \sum_{e=1}^{L} a_{ev} = c \quad \forall v = 1, .., N. \qquad (3)$$



FIG. 2: An example of a (small) regular hyper-graph with (c, d, N) = (2, 3, 9) 



For clarity, we always use indices $v, w, .. = 1, .., N$ to indicate message bits (or vertices), and indices $e, f, .. = 
1, .., L$ to indicate parity checks (or edges). Note that the parity check matrix of a given code is provided as 
$A = \{a_{ev}\}$ (mod 2). For the NML and MLL-$\ell$ ($\ell \ge 3$) ensembles, the matrix elements are constrained to binary 
values, $a_{ev} = 0, 1$. Therefore, $\{a_{ev}\}$ itself represents the parity check matrix A in these cases, which implies 
that every parity check is composed of d bits and every bit is associated with c checks. However, as $a_{ev}$ can 
take any integer value from 0 to c in the MB ensemble, it can occur that certain rows and/or columns of the parity 
check matrix A are composed of only zero elements, which means that the corresponding checks and/or bits do 
not contribute to the error correction mechanism. Note that in the matrix notation the exclusion of $l$-loops 
corresponds to the extra (redundant) constraints (additional to $a_{ev} = 0, 1$ and (3)): 

$$\prod_{i=1}^{l} a_{e_i v_i}\, a_{e_i v_{(i \bmod l)+1}} = 0 \qquad \forall \{v_i, i = 1..l\},\ \{e_i, i = 1..l\}, \qquad (4)$$

where $\{v_i / e_i, i = 1..l\}$ is a group of $l$ different vertices/edges. 

There is a one-to-one correspondence between the bipartite graph and the matrix representation of the codes. Note, 
however, that each monopartite graph corresponds to a number of bipartite graph/matrix representations 
(identical up to permutation of the edges). 
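
To make the matrix representation concrete, the following Python sketch builds the link-count matrix $\{a_{ev}\}$ from a list of (vertex, edge) links, checks the regularity constraints (row sums d, column sums c), and reduces it mod 2 to obtain the parity check matrix A. The helper names `link_matrix`, `is_regular` and `parity_check_matrix` are hypothetical, and indices start at 0 rather than 1.

```python
def link_matrix(links, L, N):
    """a[e][v] = number of links between edge (check) e and vertex (bit) v."""
    a = [[0] * N for _ in range(L)]
    for v, e in links:
        a[e][v] += 1
    return a


def is_regular(a, c, d):
    """Check the (c, d) constraints: every row sums to d, every column to c."""
    rows_ok = all(sum(row) == d for row in a)
    cols_ok = all(sum(col) == c for col in zip(*a))
    return rows_ok and cols_ok


def parity_check_matrix(a):
    """Parity check matrix A = {a_ev} (mod 2)."""
    return [[x % 2 for x in row] for row in a]


if __name__ == "__main__":
    # A tiny MB-type example with double links: (c, d, N) = (2, 2, 2).
    links = [(0, 0), (0, 0), (1, 1), (1, 1)]
    a = link_matrix(links, L=2, N=2)
    print(a, is_regular(a, c=2, d=2), parity_check_matrix(a))
```

The tiny example illustrates the remark above: with double links the count matrix is regular, yet A (mod 2) contains only zeros, so the checks contribute nothing to error correction.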



IV. DIAGRAMMATIC EVALUATION OF ERROR PROBABILITY 



A. Low weights and most dangerous diagrams 

Now, let us start to evaluate the error probability. For this, we first investigate the configurations in the 
bipartite graph representation that are necessary for creating codeword vectors having a given weight. 

Assume that a codeword vector x, which is characterised by $Ax = 0$ (mod 2), has a weight $w(x) = n_v$. As the addition 
of zero elements of x does not change parity checks, we can focus on only the $n_v$ non-zero elements. Then, in order 
to satisfy the parity relation $Ax = 0$ (mod 2), every edge associated with the $n_v$ vertices corresponding to these non-zero 
elements must receive an even number of links from the $n_v$ vertices in the bipartite representation. 

Let us now evaluate how frequently such configurations appear in the whole bipartite graph when a code is generated 
from a given ensemble. We refer to a sub-set of $n_v$ vertices as $V_f$. Each vertex v is directly linked to a subset 
$\mathcal{E}(v)$ of the edges. We denote $\mathcal{E}(V_f) = \bigcup_{v \in V_f} \mathcal{E}(v)$, and $n_e = |\mathcal{E}(V_f)|$. Then there are $c n_v$ links to be put between $V_f$ 
and $\mathcal{E}(V_f)$. Note that there are exactly c links arriving at each $v \in V_f$, and d links arriving at each $e \in \mathcal{E}(V_f)$. Each 
diagram consists in a combination $(V_f, \mathcal{E}(V_f))$. For an admissible diagram, we have the extra condition that each 
$e \in \mathcal{E}(V_f)$ receives an even number of links from $V_f$, such that the bits in $V_f$ can be collectively flipped preserving 
the parity relation $Ax = 0$ (mod 2). 

Note that we ignore the links arriving in $\mathcal{E}(V_f)$ from outside the set $V_f$. For admissible diagrams, $n_e$ is limited 
from above by $\mathrm{int}(c n_v / 2)$, where int(x) is the integer part of x. A number $n_e$ of the links can be put freely, while the 
remaining $c n_v - n_e$ links all have to fall in the group of $n_e$ edges (out of L), such that each of those 
(forced) links carries a probability $\sim N^{-1}$. 

There are $\binom{N}{n_v} \simeq N^{n_v}/n_v!$ ways of picking $n_v$ ($\ll N$) out of N vertices, such that each diagram consisting of $n_v$ 
vertices and $n_e$ edges carries a probability of occurrence proportional to $N^{n_v - (n_v c - n_e)}$. 
This observation allows us to identify the "most dangerous" admissible diagrams as those with a probability of 
occurrence with the least negative power of N, i.e. that combination $(n_v^*, n_e^*)$ that maximises $n_e - (c-1) n_v$. The 
collective contributions of all other diagrams are at least a factor $N^{-1}$ smaller, and therefore negligible. From this, 
it immediately follows that $n_e$ must take its maximal value, which is $n_e^* = \mathrm{int}(c n_v^* / 2)$, while $n_v$ has to be minimised, 
compatible with the constraints of the code ensemble under consideration. Hence, the probability $P_f \simeq P_f(n_v^*)$ that 
a generated graph (code) contains a most dangerous diagram including $n_v^*$ vertices scales like 

$$P_f(n_v^*) \sim N^{n_v^* \left(1 - \frac{c}{2}\right)} \qquad (5)$$

with the constraint on $n_v^*$ that $c\,n_v^*/2$ is integer. Note that for all ensembles we consider in this paper $n_v^* \sim O(1)$. 
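
The power counting above is easy to check numerically. The following Python sketch computes the exponent $n_e - (c-1) n_v$ of N for a diagram with the maximal admissible number of edges $n_e = c n_v / 2$; the function name `leading_power` is hypothetical, and the sketch simply encodes the counting argument of this subsection under the stated constraint that $c n_v / 2$ is an integer.

```python
def leading_power(n_v: int, c: int) -> int:
    """Exponent a such that a most dangerous diagram on n_v vertices occurs
    with probability ~ N**a, taking n_e at its maximum c * n_v / 2."""
    if (c * n_v) % 2 != 0:
        raise ValueError("c * n_v / 2 must be an integer for this diagram")
    n_e = c * n_v // 2
    # n_v free choices of vertices, c*n_v - n_e forced links at ~1/N each.
    return n_e - (c - 1) * n_v


if __name__ == "__main__":
    for c, n_v in [(2, 3), (4, 1), (3, 2), (3, 4)]:
        print(f"c={c}, n_v={n_v}: P_f ~ N^{leading_power(n_v, c)}")
```

For c = 2 the exponent is 0 for any $n_v$, while for c > 2 it becomes more negative as the minimal admissible $n_v$ grows, which is the mechanism exploited by the expurgation of short loops.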

At this point, we make some important observations: 

- Firstly, from eq. (5) it is easily seen that for c = 2, any diagram containing an equal number of vertices and 
edges scales like $N^0$. The number of diagrams contributing to $\overline{P_B}$ becomes infinite, and $\overline{P_B} \sim O(1)$. It was already 
recognised by Gallager [7] that regular (2, d, N) codes have very bad decoding properties under the block 
error criterion, which is currently adopted. Therefore, in what follows, we will implicitly assume that $2 < c < d$. 

- Secondly, as eq. (5) represents only the dependence on the code length N, for an accurate evaluation of the 
asymptotic behaviour (for $N \to \infty$) of the error probability, we have to calculate the prefactor. Nevertheless, this 
kind of power counting is still highly useful because it directly identifies the major causes of the poor performance, 
and allows us to concentrate on only a few relevant diagrams for further calculation, ignoring innumerable other 
minor factors. This is more systematic and much easier to apply in various ensembles than the existing method of 
[6]. 

- Thirdly, once $P_f \simeq P_f(n_v^*)$ is accurately evaluated, the average block error probability $\overline{P_B}$ for any decoding scheme 
DS and for sufficiently low flip rates p can be evaluated as 

$$\overline{P_B^{DS}} \simeq P_{DS}(e|n_v^*)\, P_f, \qquad (6)$$

where e.g. for ML decoding $P_{ML}(e|n_v^*)$ is given in (2) (the expressions for other decoding schemes can be found in 
appendix B). 

- The NML and MLL-$\ell$ ensembles are generated from the MB ensemble by expurgating specific kinds of codes, which 
slightly changes the distribution of codes such that the above argument based on free sampling of links in constructing 
graphs might not be valid. However, for $n_v \sim O(1)$ the current evaluation still provides correct results for the 
leading contribution to $P_f(n_v)$, since the expurgation procedure has a negligible effect on the leading 
contributions of the most dangerous diagrams. 

- Finally, we note that this analysis essentially follows the weight enumerator formalism, which can be regarded 
as a certain type of annealed approximation. Although we have argued that such a formalism is not capable of 
accurately evaluating the performance of typical codes, which decays exponentially with respect to N, the current 
scheme correctly assesses the leading contribution of the average error probability $\overline{P_B}$ of the above ensembles, which 
scales polynomially with respect to N, being dominated by atypically poor codes. 

This is because the probability of occurrence scales like $N^{-\alpha}$ (with $\alpha \ge 1$) for all admissible diagrams. Therefore, we 
can safely ignore the possibility that more than one such diagram occurs in the same graph (as illustrated in fig. 7), 
and to leading order for $P_f$, we can simply add the contributions of all most dangerous diagrams. This is not so for the 
typical case analysis (with $n_v^* \sim O(N)$), where exponentially rare codes may have an exponentially large contribution. 
In order to avoid this kind of over counting, a quenched magnetisation enumerator based treatment is then more 
suitable. Note furthermore that in the SM treatment of typical codes, the polynomial error probability is hidden 
in the ferro-magnetic solution (since the fraction of flipped bits vanishes when $n_v^* \sim o(N)$), and is therefore easily overlooked. 



B. Evaluation of error probability for various ensembles 

Once the notion of most dangerous diagrams is introduced, the asymptotic behaviour of the probability Pf for 
sufficiently low flip rates p is easily evaluated for various ensembles. 

• MB ensemble: For the MB ensemble, multiple links between a vertex/edge pair are allowed, which forces 
us to make a distinction between even and odd c. For even c the minimal admissible $n_v$ is 1, such that the error 
probability will scale like $N^{1-c/2}$, while for odd c it is 2, which provides a faster scaling $N^{2-c}$. 

-For c even, the most dangerous diagram is given in Fig. 3, and the probability $P_f \simeq P_f(n_v = 1)$ is given by (an 
explanation of the diagrams, and how their multiplicity is obtained, can be found in appendix A) 

$$P_f(\ell = 1, c\ \mathrm{even}) \simeq N^{1-\frac{c}{2}}\, (1-R)^{-\frac{c}{2}} \left(\frac{d-1}{2d}\right)^{\frac{c}{2}} \frac{c!}{(c/2)!} \qquad (7)$$

Note that with "$x \simeq y$", we indicate that $x = y\,(1 + O(1/N))$. 

FIG. 3: Left: $n_v = 1$, $n_e = c/2$ (c is even). Right: $n_v = 2$, $n_e = c$, $k = 0, 1, .., \mathrm{int}(c/2)$. 

-For c odd the minimal admissible $n_v$ is 2, such that the error probability will scale like $N^{2-c}$. The most dangerous 
diagrams are given in Fig. 3 (with $k = 0, 1, .., \mathrm{int}(c/2)$), and the probability $P_f \simeq P_f(n_v = 2)$ is given by (for details see 
appendix A) 

$$P_f(\ell = 1, c\ \mathrm{odd}) \simeq N^{2-c}\, (1-R)^{-c} \left(\frac{d-1}{d}\right)^{c} \frac{(c!)^2}{2} \sum_{k=0}^{\mathrm{int}(c/2)} \frac{1}{2^{2k}\, (k!)^2\, (c-2k)!} \qquad (8)$$

One can check that the values (7) and (8), which we believe to be exact (not bounds), satisfy the bounds given in [6]. 



• NML ensemble: In the NML ensemble, multiple links are not allowed. In this ensemble, even and odd c 
can be treated on the same footing. In both cases, the minimal admissible $n_v = 2$, such that the error probability 
will scale like $N^{2-c}$. The most dangerous diagram is given in Fig. 3 (note that in this case only k = 0 is allowed). The 
probability $P_f \simeq P_f(n_v = 2)$ is given by (for details see appendix A) 

$$P_f(\ell = 2) \simeq N^{2-c}\, (1-R)^{-c} \left(\frac{d-1}{d}\right)^{c} \frac{c!}{2} \qquad (9)$$



• MLL-3 ensemble: In the MLL-3 ensemble, neither multiple links nor loops of length 2 are allowed. Note 
that this also implies that pairs of vertices may not appear more than once together in a parity check, such that all 
parity checks are different, making each monopartite graph correspond to exactly $L!$ bipartite graphs/matrices. In 
the monopartite graph any pair of vertices is now connected by at most one (hyper-)edge. One can easily convince 
oneself that the minimal $n_v = c + 1$ (each v needs c other vertices to connect to), and the most dangerous diagram is 
given in Fig. 4. The dominant part of $P_f \simeq P_f(c+1)$ is given by (for details see appendix A) 

$$P_f(\ell = 3) \simeq N^{(c+1)\left(1-\frac{c}{2}\right)}\, (1-R)^{-\frac{c(c+1)}{2}} \left(\frac{d-1}{d}\right)^{\frac{c(c+1)}{2}} \frac{(c!)^{c+1}}{(c+1)!} \qquad (10)$$

FIG. 4: $l \le 2$ loops removed, $n_v = c+1$, $n_e = c(c+1)/2$. 



• MLL-$\ell$ ensemble: In the general MLL-$\ell$ ensemble, neither multiple links nor loops of length $l < \ell$ are allowed. In 
general, identifying the most dangerous diagram(s) and especially calculating their combinatorial prefactor becomes 
increasingly difficult with increasing $\ell$, but we can still find the scaling of $P_f$ relatively easily by power counting. For 
this purpose it is more convenient to use the monopartite graph representation. In Fig. 5 we observe that for $\ell = 3$, 4 
and 5, the minimal $n_v$ is given by $c+1$, $2c$ and $2c + c(c-1) = c(c+1)$ respectively, such that using (5) we obtain 

$$P_f(\ell = 3) \sim N^{(c+1)\left(1-\frac{c}{2}\right)}, \qquad P_f(\ell = 4) \sim N^{2c\left(1-\frac{c}{2}\right)}, \qquad P_f(\ell = 5) \sim N^{c(c+1)\left(1-\frac{c}{2}\right)} \qquad (11)$$

FIG. 5: Monopartite graph representations for some most dangerous diagrams. Left: c = 3, all loops of length $l < 3$ are 
removed. Middle: c = 5, all loops of length $l < 4$ are removed. Right: c = 3, all loops of length $l < 5$ are removed. 

For $\ell \ge 6$, even finding the most dangerous diagram, and thus power counting, becomes quite difficult to do by hand, 
but we can still easily upper bound the power by the following observation: from Fig. 6 we observe that for a given 
minimal allowed loop length $\ell$ the minimum number of generations without links between them, starting from any 
vertex v, is given by $\mathrm{int}\left(\frac{\ell-1}{2}\right)$. Therefore the minimal $n_v$ is lower bounded by 

$$n_v \ge 1 + c \sum_{k=0}^{\mathrm{int}\left(\frac{\ell-1}{2}\right)-1} (c-1)^k = 1 + c\, \frac{(c-1)^{\mathrm{int}\left(\frac{\ell-1}{2}\right)} - 1}{(c-1) - 1}, \qquad (12)$$

which implies that $P_f$ can be upper bounded as 

$$P_f(\ell) \le O\left( N^{\left(1-\frac{c}{2}\right)\left(1 + c\, \frac{(c-1)^{\mathrm{int}\left(\frac{\ell-1}{2}\right)} - 1}{(c-1) - 1}\right)} \right) \qquad (13)$$



In fig. 7 we have plotted the frequencies of occurrence of dangerous diagrams that scale like $N^{-1}$. We have randomly 
generated $10^6$ code realisations (for $N \simeq 50, 100, 200, 400$, i.e. the nearest integer for which L = cN/d is also integer), 
and have plotted both the total frequency (multiplied by N) of occurrence of a diagram (dashed lines), and the 
frequency that a graph contains the diagram at least once (full lines). Note that in the limit $N \to \infty$ both coincide, 
illustrating the fact that we can safely ignore the possibility that more than one such diagram occurs in the same 
graph. We observe that the extrapolations $1/N \to 0$ are all in full accordance with the theoretical predictions. 

All this clearly illustrates how the exclusion of short loops reduces $P_f$, and thus through (6) the polynomial error 
probability. Furthermore, from figs. 3-6 it is clear that all most dangerous diagrams contain short loops. 
Knowledge of the distribution of the number of short loops in the various ensembles is therefore relevant for our 
current purposes, and we analyse the distribution of the number of $\ell$-loops in the next subsection. 
FIG. 6: All loops of length $l < \ell$ are removed. The minimal size of the last generation can not be less than $c(c-1)^{\mathrm{int}((\ell-1)/2)-1}$ 
without generating loops of length $< \ell$. 

FIG. 7: Frequencies f (multiplied with N) of occurrence of dangerous diagrams that scale like $N^{-1}$, plotted against $1/N$. Left: c = 3, d = 4, 6 with 
k = 0 both with (□) and without (+) removing 1-loops, and with k = 1 (x). Right: c = 4, d = 2..6. 


C. The distribution of the number of $\ell$-loops 

In this subsection we investigate the distribution $P_\ell(k)$ of the number k of $\ell$-loops for the various code ensembles. 
Note that we only consider irreducible $\ell$-loops, in the sense that they are not combinations of shorter loops (i.e. they 
do not visit the same vertex or edge twice). Note that an $\ell$-loop in the monopartite graph corresponds to a $2\ell$-cycle 
in the bipartite graph representation. The number of irreducible $\ell$-loops in a random regular (c, d, N) graph (with 
$N \to \infty$) has the following distribution: 

$$P_\ell(k) \equiv P(\#\,\ell\text{-loops} = k) \simeq \frac{\lambda_\ell^k}{k!} \exp(-\lambda_\ell), \qquad \lambda_\ell = \frac{(c-1)^\ell (d-1)^\ell}{2\ell} \qquad (14)$$

For the derivation of this result we refer to appendix A. From (14) we observe that the average number of short loops 
increases rapidly with c and d. Furthermore we note the symmetry between c and d, which reflects the edge/vertex 
duality which is typical for loops. 

As explained in appendix A, the constraint that no loops of length $l < \ell$ are present in the graph has no influence, to 
leading order, on the diagrams for loops of length $\ge \ell$. 
FIG. 8: The dominant diagram for $\ell$-loops. 

We denote the number of codes in the ensemble where the minimum loop length is $\ell$ (i.e. loops of length 
$l < \ell$ have been removed) by $\mathcal{N}_\ell(c,d,N)$, such that the size of the original (MB) ensemble with all regular (c,d,N) 
codes is denoted by $\mathcal{N}_1(c,d,N)$. From (14) it follows that the size of $\mathcal{N}_\ell(c,d,N)$ is given by 

$$\mathcal{N}_\ell(c,d,N) = \exp\left(-\sum_{l=1}^{\ell-1} \lambda_l\right) \mathcal{N}_1(c,d,N) = \exp\left(-\sum_{l=1}^{\ell-1} \frac{(c-1)^l (d-1)^l}{2l}\right) \mathcal{N}_1(c,d,N) \qquad (15)$$

The reduction factor $\exp\left(-\sum_{l=1}^{\ell-1} \lambda_l\right)$ is $O(1)$ for any finite loop length $\ell$. Since the number of non-equivalent codes 
in the original ensemble $\mathcal{N}_1(c,d,N)$ is very large, the final ensemble $\mathcal{N}_\ell(c,d,N)$ is still very big, but clearly smaller than 
Gallager's ideally expurgated ensemble, which has a reduction factor of just $\frac{1}{2}$. 
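
To give a feeling for the numbers involved, the following Python sketch evaluates the Poisson loop statistics and the ensemble reduction factor. It assumes that the mean number of $\ell$-loops is $\lambda_\ell = ((c-1)(d-1))^\ell / (2\ell)$ and that the fraction of codes surviving the removal of all loops shorter than $\ell$ is $\exp(-\sum_{l<\ell} \lambda_l)$; the function names are hypothetical.

```python
from math import exp, factorial


def loop_rate(ell: int, c: int, d: int) -> float:
    """Assumed mean number of irreducible ell-loops, lambda_ell."""
    return ((c - 1) * (d - 1)) ** ell / (2 * ell)


def loop_count_pmf(k: int, ell: int, c: int, d: int) -> float:
    """Poisson probability of finding exactly k loops of length ell."""
    lam = loop_rate(ell, c, d)
    return lam ** k / factorial(k) * exp(-lam)


def surviving_fraction(ell_min: int, c: int, d: int) -> float:
    """Fraction of MB codes that contain no loop shorter than ell_min."""
    return exp(-sum(loop_rate(l, c, d) for l in range(1, ell_min)))


if __name__ == "__main__":
    c, d = 3, 4
    for ell in range(1, 5):
        print(ell, loop_rate(ell, c, d), surviving_fraction(ell, c, d))
```

Under these assumptions the surviving fraction is independent of N but shrinks quickly with $\ell$, which is why naive rejection sampling of loop-free codes becomes impractical and a constructive removal procedure is preferable.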

Note that the presence of (short) loops in a code not only adversely affects the polynomial error probability, but 
also the success rate of practical decoding algorithms such as belief propagation [3, 25]. 



V. PRACTICAL LINEAR ALGORITHM TO THE $\ell$-LOOP EXPURGATED ENSEMBLE 



In this section we propose a linear time (in N) algorithm that generates codes and removes loops up to arbitrary 
length (the combination (c, d, N) permitting). We also present simulation results, which corroborate our assumptions 
about the validity of the diagrammatic approach as presented in this paper. Finally we give some practical limits and 
guidelines for code-ensembles with large but finite N. 



1. Generating a random regular (c,d,N) code: 

The algorithm to generate a random regular (c, d, N) code consists in the following steps: 

1. make a list of available vertices $A_v$ of initial length $N_{av} = cN$, where each vertex appears exactly c times 

2. for each of the L = cN/d parity checks, d times: 

(a) randomly pick a vertex from $A_v$, 

(b) remove it from the list, 

(c) $N_{av} = N_{av} - 1$. 

Note that in the process we construct the lists: 

1. EV[v][i], v = 1..N, i = 1..c, containing the edges each vertex v is involved in, 

2. VE[e][j], e = 1..L, j = 1..d, containing the vertices each edge e involves. 

It is clear that this algorithm is linear in N. 
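
The generation steps above can be sketched in Python as follows. The list names `EV` and `VE` follow the text, while the function name `generate_regular_code`, the 0-based indexing and the swap-with-last trick for O(1) removal are choices made for this illustration rather than details taken from the paper.

```python
import random


def generate_regular_code(N, c, d, rng):
    """Generate a random regular (c, d, N) code.

    Returns (EV, VE): EV[v] lists the edges of vertex v (length c),
    VE[e] lists the vertices of edge e (length d). Multiple links may occur.
    """
    if (c * N) % d != 0:
        raise ValueError("cN must be divisible by d")
    L = c * N // d
    # List of available vertices: each vertex appears exactly c times.
    available = [v for v in range(N) for _ in range(c)]
    EV = [[] for _ in range(N)]
    VE = [[] for _ in range(L)]
    for e in range(L):
        for _ in range(d):
            j = rng.randrange(len(available))
            # Move the picked entry to the end so that removal is O(1).
            available[j], available[-1] = available[-1], available[j]
            v = available.pop()
            VE[e].append(v)
            EV[v].append(e)
    return EV, VE


if __name__ == "__main__":
    EV, VE = generate_regular_code(N=12, c=3, d=4, rng=random.Random(0))
    print(VE)
```

Each of the cN picks costs O(1), so the whole construction is linear in N, in line with the statement above.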
2. Finding loops of length $\ell$ in the code: 

We now describe the algorithm to detect (and store) all $\ell\,(\ge 2)$-loops in the graph: 

1. we consider all the vertices as a possible starting point $v_0$ of the loop 

2. given a starting point $v_0$, grow a walk of length $\ell$. Each growing step $l$ consists in: 

(a) take $e_l$ from EV[$v_l$] and check conditions for valid step 

(b) take $v_{l+1}$ from VE[$e_l$] and check conditions for valid step 

(c) if all conditions are satisfied go to next step 

else if possible go to the next $e_l \in$ EV[$v_l$] or $v_{l+1} \in$ VE[$e_l$] 
else go to previous step 

3. finally check whether the end point of the loop $v_\ell = v_0$; if so store the loop, i.e. $C_\ell = \{(v_l, e_l),\ l = 0..\ell-1\}$ 

The conditions for a valid step are the following: 

1. (a) $e_l \neq e_i$ ($i = 0..l-1$) for $\ell > 2$, 

(b) $e_1 > e_0$ for $\ell = 2$. 

2. (a) $v_l > v_0$, $v_l \neq v_i$ ($i = 1..l-1$) for $l = 1..\ell-1$, $\ell \ge 2$, 

(b) $v_{\ell-1} > v_1$ for $l = \ell-1$, $\ell > 2$, 

(c) $v_l = v_0$ for $l = \ell$. 

Note that the conditions $v_l > v_0$ fix the starting point of the loop, while the conditions $v_{\ell-1} > v_1$ for $\ell > 2$, and 
$e_1 > e_0$ for $\ell = 2$, fix its orientation. This has a double advantage: it avoids over counting, and reduces execution 
time by early stopping of the growing process. 

For 1-loops (2-cycles), for all v = 1..N we simply look in EV[v] for double links to the same edge, i.e. EV[v][i] = 
EV[v][j]. Imposing that i < j then avoids double counting. 

Since each vertex is connected to c edges, and each edge is connected to d vertices, the number of operations to check 
whether any vertex v is involved in a loop remains $O(1)$ (compared to $N \to \infty$). As we have to check this for all 
vertices, the loop finding stage of the algorithm is linear in N. 
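
The loop search can be sketched as a small depth-first enumeration in Python. This is an illustrative sketch, not the authors' implementation: `find_loops` is a hypothetical name, indices start at 0, the lists `EV` and `VE` are assumed to have the layout described in the text, and the orientation condition is only tested when the loop is closed (without the early stopping mentioned above).

```python
def find_loops(EV, VE, ell):
    """Return all irreducible ell-loops as lists [(v_0, e_0), ..., (v_{ell-1}, e_{ell-1})]."""
    loops = []
    if ell == 1:
        # 1-loops are double links: the same edge appears twice in EV[v].
        for v, edges in enumerate(EV):
            for i in range(len(edges)):
                for j in range(i + 1, len(edges)):  # i < j avoids double counting
                    if edges[i] == edges[j]:
                        loops.append([(v, edges[i])])
        return loops

    def oriented(vs, es):
        # Fix the orientation: e_1 > e_0 for ell = 2, v_{ell-1} > v_1 otherwise.
        return es[1] > es[0] if ell == 2 else vs[-1] > vs[1]

    def grow(vs, es):
        for e in sorted(set(EV[vs[-1]])):
            if e in es:
                continue  # edges of a loop must all be different
            if len(vs) == ell:
                # Closing step: the last edge must lead back to v_0.
                if vs[0] in VE[e] and oriented(vs, es + [e]):
                    loops.append(list(zip(vs, es + [e])))
                continue
            for w in sorted(set(VE[e])):
                # v_l > v_0 fixes the starting point; vertices must differ.
                if w > vs[0] and w not in vs:
                    grow(vs + [w], es + [e])

    for v0 in range(len(EV)):
        grow([v0], [])
    return loops


if __name__ == "__main__":
    # Triangle: three vertices, three edges of size d = 2, c = 2.
    EV = [[0, 2], [0, 1], [1, 2]]
    VE = [[0, 1], [1, 2], [0, 2]]
    print(find_loops(EV, VE, 3))
```

Because every vertex has c edges and every edge has d vertices, the work done from each starting vertex is bounded independently of N for fixed $\ell$, which is the source of the linear scaling.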

3. Removing loops of length I from the code: 

We start by detecting and removing the smallest loops and than work our way towards longer loops. Assuming that 
all shorter loops have been successfully removed, and having found and listed all the loops of length £, the procedure 
for removing them is very simple. For all stored ^-loops Cf. 

1. randomly pick a vertex/edge combination $(v_p, e_p)$ from $C_\ell = \{(v_l, e_l),\ l = 0..\ell-1\}$;

2. swap it with a random other vertex $v_s$ in a random other edge $e_s$ [27];

3. for $v_p$ and $v_s$, check whether they are now involved in an $l$-loop with $l < \ell$:

(a) if so, undo the swap and go to 1;

(b) else accept the swap: the loop is removed.
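
The swap procedure can be sketched as follows. This is a hedged illustration under several assumptions that are not taken from the text: the graph is stored as a flat list `links` of `(v, e)` pairs, the swap turns `(v_p, e_p)` and `(v_s, e_s)` into `(v_s, e_p)` and `(v_p, e_s)`, and the helper names `adjacency`, `on_short_loop` and `remove_loop`, as well as the trial cap `max_trials`, are hypothetical. For brevity the adjacency lists are rebuilt at every check, which costs $O(N)$; an implementation aiming at $O(1)$ per swap would update them in place.

```python
import random


def adjacency(links):
    """Build EV (edges per vertex) and VE (vertices per edge) from (v, e) links."""
    EV, VE = {}, {}
    for v, e in links:
        EV.setdefault(v, []).append(e)
        VE.setdefault(e, []).append(v)
    return EV, VE


def on_short_loop(links, v0, ell):
    """True if v0 lies on a loop of length l < ell (double links count as 1-loops)."""
    EV, VE = adjacency(links)
    if ell > 1 and len(set(EV[v0])) < len(EV[v0]):
        return True

    def walk(v, used, seen, depth):
        if depth + 1 >= ell:             # closing now would give a length >= ell
            return False
        for e in EV[v]:
            if e in used:
                continue
            for w in VE[e]:
                if w == v0 and depth >= 1:
                    return True          # closed a loop of length depth + 1 < ell
                if w not in seen and walk(w, used | {e}, seen | {w}, depth + 1):
                    return True
        return False

    return walk(v0, frozenset(), frozenset([v0]), 0)


def remove_loop(links, loop, ell, max_trials=100, rng=random):
    """Break one stored ell-loop by a link swap; return True on success."""
    for _ in range(max_trials):
        p = links.index(rng.choice(loop))        # step 1: pick (v_p, e_p) from the loop
        s = rng.randrange(len(links))            # step 2: pick a random (v_s, e_s)
        (vp, ep), (vs, es) = links[p], links[s]
        if vs == vp or es == ep:                 # must be another vertex in another edge
            continue
        links[p], links[s] = (vs, ep), (vp, es)  # v_s moves into e_p, v_p into e_s
        if on_short_loop(links, vp, ell) or on_short_loop(links, vs, ell):
            links[p], links[s] = (vp, ep), (vs, es)   # step 3(a): undo the swap
            continue
        return True                              # step 3(b): accept the swap
    return False
```

Because the swap only exchanges the vertices of two links, the degree of every vertex and of every edge is unchanged, so the code stays in the regular $(c, d)$ ensemble.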

This procedure of removing loops typically takes $O(1)$ operations: the typical number of loops of each length $\ell$ is
$O(1)$; for each loop we only have to swap one vertex/edge combination to remove it; the checks that the swap is
valid take $O(1)$ operations; and we typically need only $O(1)$ swap trials to get an acceptable swap.
Although the algorithm is linear in $N$, the number of operations needed to detect and remove loops of length $\ell$ in a
$(c, d)$ code grows very rapidly with $c$ and $d$, and even exponentially with $\ell$. Furthermore, we note that all short loops
can be removed only in the $N \to \infty$ limit. In practice, for large but finite $N$ and given $(c, d)$, the maximum loop length $\ell$
is clearly limited. A rough estimate for this limit is given by

I l-Cc-l)* ,l-(d-l)^\ AT 

max { c (i-W (iVi)) J~* (16) 

because $v$ (resp. $e$) is not allowed to be its own $1, 2, .., \ell$-th nearest neighbour (see the corresponding figure). Hence, loops of logarithmic
length in $N$ cannot be avoided. For practical $(c, d, N)$, however, this loop length is reached rather quickly. Therefore,
we have built in the possibility for the algorithm to stop trying to remove a given loop when, after max-swap trials,
no suitable swap has been obtained. By choosing max-swap sufficiently large, the maximum removable loop length is
easily detected. In practice we find that for all loop lengths that can be removed, we typically need 1, and occasionally
2, trial swaps per loop.

In fig. 9 we show the distribution of loops over the $(c, d, N)$ ensemble, for $(c, d) = (3, 4)$ and $N = 10^4$, averaged
over $10^4$ codes, up to loops of length $\ell = 4$ (corresponding to length-8 cycles in the IT terminology [24]), before
and after removal of shorter loops. In general we observe that the Poisson distribution with $\lambda$ given in (14) fits the
simulations very well for all $\ell$ not exceeding the maximal removable loop length, while it breaks down above that.

FIG. 9: The distribution of $\ell$-loops ($\ell = 1, 2, 3, 4$) for $(c, d, N) = (3, 4, 10^4)$: lines -> theory, MB ensemble -> "+", MLL-$\ell$
ensemble (i.e. after removal of all loops smaller than the size plotted) -> "x", sampled from $10^4$ random code constructions.

Note that, in principle, this method could also be used to obtain Gallager's ideally expurgated ensemble. We would
start by finding and removing the most dangerous diagrams, and then move on to the next generation of most
dangerous diagrams, and so on. However, the next most dangerous diagrams are obtained by adding additional
vertices (and necessary edges) and/or by removing edges from the current most dangerous diagrams. One can easily
convince oneself that the number of next most dangerous diagrams soon becomes enormous. In addition, for each
generation we only reduce the polynomial error probability by a factor $N^{-1}$. Therefore, although possible in principle,
this method is not practical, and we have opted for the removal of loops. The fact that we only have to look for one
type of diagram (i.e. loops), and the fact that we expurgate many entire generations of next most dangerous diagrams
in one go, make the cost of over-expurgating the ensemble a small one to pay.



VI. SUMMARY 



In summary, we have developed a method to directly evaluate the asymptotic behaviour of the average block
decoding error probability for various types of low density parity check code ensembles, using
diagrammatic techniques. The method makes it possible to accurately assess the leading contribution, with respect
to the codeword length $N$, to the average error probability, which originates from a polynomially small fraction of
poor codes in the ensemble. This is done by identifying the most dangerous admissible diagrams in a given ensemble by a power
counting scheme. The most dangerous diagrams are combinations of specific types of multiple closed paths (loops)
in the bipartite graph representation of codes, and allow for codewords with low weights. The contribution of a
diagram to the error probability becomes larger as the diagram becomes smaller, which implies that one can
reduce the average error probability by excluding all codes that contain any loops shorter than a given threshold $\ell$.
We have theoretically clarified how well such a sub-optimal expurgation scheme improves the asymptotic behaviour.
We have also provided a practical algorithm for creating such sub-optimally expurgated ensembles, which typically
runs in a time linear in $N$. Numerical experiments utilising this algorithm have verified the
validity of the theoretical predictions.

The current approach is relatively easy to adapt to irregularly constructed codes, codes over extended
fields [14, 15], other noise channels, and other code constructions such as the MN codes. Work in that
direction is currently underway.
Diagrams are finite sub-graphs. Provided that the graph is large, and provided that the correlations between the
different diagrams are not too strong, we can treat them as effectively independent to leading order in $N$, even when
they have (many) vertices and/or edges in common. It then suffices to calculate the probability of occurrence of a
single diagram, and to count how many times such a diagram could occur in the graph, in order to extract its overall
expectation, allowing us to calculate all quantities that depend on it.

To illustrate this, consider the following scenario. All diagrams we consider consist of $n_v$ vertices and $n_e$ edges, with
at least 2 links arriving at each of the nodes from within the diagram. Suppose now that we replace a single node
(vertex or edge) with another one not from within the diagram. Since the probability for each link to be present
is $\sim N^{-1}$, and there is at least a 4-link difference between the diagrams, the correlation between them is
negligible to leading order.

The rules for calculating the combinatorial pre-factor of the diagram are easily described as follows:
Consider all possible sub-groups of $n_v$ vertices and $n_e$ edges. Calculate the probability $P_g$ that a given group of $n_v$
vertices and $n_e$ edges forms the diagram we are interested in. Since we assume that (to leading order in $N$) these
probabilities are independent for all groups, we just have to multiply $P_g$ with all the possible ways of picking $n_v$



Combined, this leads to the following simple recipe for the calculation of the contribution of a diagram to $P_f$:

• for each vertex from which $x$ links depart, add a factor $N\, c!/(c-x)!$.

• for each edge from which $x$ links depart, add a factor $L\, d!/(d-x)!$.



large N. 
• for each link, add a factor $1/(cN)$.

• divide by the number of symmetries, i.e. the number of permutations of vertices, edges or links, that lead to 
the same diagram. 
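
The recipe lends itself to a direct numerical check. The sketch below is an illustration only: the function name `diagram_weight` and its argument layout are hypothetical, and the per-link factor is taken to be $1/(cN)$, as it appears in (A4). Exact rational arithmetic is used so that the result can be compared with a hand calculation.

```python
from fractions import Fraction
from math import factorial


def diagram_weight(N, L, c, d, vertex_links, edge_links, symmetry):
    """Contribution of a diagram, following the recipe factor by factor.

    vertex_links[i] / edge_links[j]: number of links departing from the i-th
    vertex / j-th edge of the diagram; symmetry: number of permutations of
    vertices, edges or links that lead to the same diagram.
    """
    n_links = sum(vertex_links)
    if n_links != sum(edge_links):
        raise ValueError("every link needs one vertex end and one edge end")
    weight = Fraction(1)
    for x in vertex_links:               # factor N c!/(c-x)! per vertex
        weight *= N * Fraction(factorial(c), factorial(c - x))
    for x in edge_links:                 # factor L d!/(d-x)! per edge
        weight *= L * Fraction(factorial(d), factorial(d - x))
    weight *= Fraction(1, c * N) ** n_links   # assumed factor 1/(cN) per link
    return weight / symmetry


# One vertex with c = 4 links attached by double links to c/2 = 2 edges,
# with symmetry factor 2^(c/2) * (c/2)! = 8, in a toy graph with cN = dL = 48.
print(diagram_weight(N=12, L=8, c=4, d=6, vertex_links=[4], edge_links=[2, 2], symmetry=8))
```

The toy sizes are chosen only to keep the arithmetic readable; the recipe itself is a leading-order statement for large $N$.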



a. calculation of the diagrams in fig. 3

We calculate the probability $P_f(1, c/2)$ that a combination of 1 vertex $v$ and $c/2$ edges forms the left diagram in fig. 3, in
the following steps:

• 1 vertex with $c$ links ($N c!$).

• $c/2$ edges with 2 links ($(L d(d-1))^{c/2}$).

• $c$ links ($(1/(cN))^{c}$).

• symmetry: $c/2$ double links ($2^{c/2}$).

• symmetry: permutation of the edges ($(c/2)!$).

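So, combined we have that

$$P_f(1, c/2) \simeq \frac{c!}{(c/2)!} \left(\frac{d(d-1)}{2}\right)^{c/2} \frac{N L^{c/2}}{(cN)^{c}} \qquad \text{(A1)}$$

and after some reworking we obtain the corresponding expression in the main text.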

We calculate the probability $P_f(2, c)$ that a combination of 2 vertices and $c$ edges forms the right diagram in
fig. 3, for a given value of $k$, in the following steps:

• 2 vertices with $c$ links ($(N c!)^{2}$).

• $c$ edges with 2 links ($(L d(d-1))^{c}$).

• $2c$ links ($(1/(cN))^{2c}$).

• symmetry: permute edges in groups of $k$ ($k!^{2}$).

• symmetry: permute edges in group of $c - 2k$ ($(c-2k)!$).

• symmetry: $2k$ double links ($2^{2k}$).

• symmetry: simultaneously permute the vertices and the groups of $k$ edges (2).
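So, combined we have that

$$P_f(2, c) \simeq \frac{c!^{2}\, (d(d-1))^{c}}{2 \cdot 2^{2k}\, k!^{2}\, (c-2k)!} \, \frac{N^{2} L^{c}}{(cN)^{2c}} \qquad \text{(A2)}$$

and after some reworking we obtain the corresponding expressions in the main text.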



b. calculation of the diagram in fig^ 

We calculate the probability $P_f(c+1, c(c+1)/2)$ that a combination of $c+1$ vertices and $c(c+1)/2$ edges forms this diagram
in the following steps:

• $c+1$ vertices with $c$ links ($(N c!)^{c+1}$).

• $\frac{c(c+1)}{2}$ edges with 2 links ($(L d(d-1))^{c(c+1)/2}$).

• $c(c+1)$ links ($(1/(cN))^{c(c+1)}$).

• symmetry: permute vertices ($(c+1)!$).

• symmetry: permute edges ($(\frac{c(c+1)}{2})!$).
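So, combined we have that

$$P_f(c+1, c(c+1)/2) \simeq \frac{c!^{\,c+1}\, (d(d-1))^{c(c+1)/2}}{(c+1)!\, \left(\frac{c(c+1)}{2}\right)!} \, \frac{N^{c+1} L^{c(c+1)/2}}{(cN)^{c(c+1)}} \qquad \text{(A3)}$$

and after some reworking we obtain (10).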
c. distribution of the number of loops of length $\ell$

The probability $P_\ell(k)$ that there are $k$ loops of length $\ell$ (i.e. including $\ell$ vertices and edges of the bipartite graph)
can be calculated from the corresponding loop diagram. By power counting it is easily checked that the probability for any loop length
$\ell$ to occur is $O(1)$. Therefore, we adopt a slightly different strategy compared to the diagrams above (this also
illustrates where some of the rules of our recipe originate from).

First we calculate the probability $P_{\ell,g}$ that a given group of $\ell$ vertices (and $\ell$ edges) forms a "true" $\ell$-loop in
the following steps:

• We order the $\ell$ vertices into a ring ($\frac{\ell!}{2\ell}$ ways).

• For each pair of consecutive vertices we pick one of the edges to connect to both ($\ell!$ ways).

• For each vertex choose a link to each edge it is connected to ($c(c-1)$ ways).

• For each edge choose a link to each vertex it is connected to ($d(d-1)$ ways).

• The probability that a chosen left and right link are connected is given by $\frac{1}{cN}$.

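So, combined we have that

$$P_{\ell,g} \simeq \frac{\ell!^{2}}{2\ell} \left(c(c-1)\,d(d-1)\right)^{\ell} \left(\frac{1}{cN}\right)^{2\ell} \qquad \text{(A4)}$$

There are $\binom{N}{\ell}$ ways to pick the vertices, and $\binom{L}{\ell}$ ways to pick the edges.

We want exactly $k$ of these to form a loop, and $\binom{N}{\ell}\binom{L}{\ell} - k$ of these not to form an $\ell$-loop, therefore:

$$P_\ell(k) = \binom{\binom{N}{\ell}\binom{L}{\ell}}{k} P_{\ell,g}^{\,k} \left(1 - P_{\ell,g}\right)^{\binom{N}{\ell}\binom{L}{\ell} - k} \simeq \frac{\lambda_\ell^{k}}{k!} \exp(-\lambda_\ell) \qquad \text{(A5)}$$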

Note that the exclusion (or not) of shorter loops has no influence on the leading order of $P_\ell(k)$, since the probability
of having a short-cut, i.e. another edge that connects 2 vertices from within the group of $\ell$ (or a vertex that connects 2
edges from within the group of $\ell$), requires 2 extra links to be present. This adds a factor to the probability
$P_{\ell,g}$ that is suppressed by an inverse power of $N$, and is therefore negligible.
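
The Poisson rate implied by this counting can be evaluated numerically. The sketch below multiplies the number of groups, $\binom{N}{\ell}\binom{L}{\ell}$, by the per-group probability of (A4). It additionally assumes the regular-code relation $cN = dL$, under which the product tends to $((c-1)(d-1))^{\ell}/(2\ell)$ for large $N$; this closed form is our own simplification and is presumably what the main-text expression for $\lambda$ states, but that equation lies outside this section. The function names are hypothetical.

```python
from math import comb, exp, factorial


def loop_rate(N, c, d, ell):
    """Expected number of ell-loops: number of groups times P_{ell,g} from (A4)."""
    L = c * N // d                       # assumes cN = dL with integer L
    p_group = (factorial(ell) ** 2 / (2 * ell)
               * (c * (c - 1) * d * (d - 1)) ** ell / (c * N) ** (2 * ell))
    return comb(N, ell) * comb(L, ell) * p_group


def loop_rate_large_n(c, d, ell):
    """Large-N limit of loop_rate under the assumption cN = dL."""
    return ((c - 1) * (d - 1)) ** ell / (2 * ell)


def poisson(k, lam):
    """Poisson probability of observing k loops when the rate is lam."""
    return lam ** k * exp(-lam) / factorial(k)


for ell in (1, 2, 3, 4):
    print(ell, loop_rate(10 ** 4, 3, 4, ell), loop_rate_large_n(3, 4, ell))
```

For $(c, d) = (3, 4)$ the large-$N$ rates are 3, 9, 36 and 162 for $\ell = 1, 2, 3, 4$, and the finite-$N$ values at $N = 10^4$ differ from them by well under one percent.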



APPENDIX B: 



As shown, for a given code ensemble, the probability $P_f$ that a finite group of $n_v$ bits can be collectively flipped is
completely dominated by sub-sets of size $n_v^*$, such that $P_f \simeq P_f(n_v^*)$. From this we can then determine the polynomial
error probability $P_B$, which depends on the decoding scheme employed. Here, we concentrate on the BSC$(p, 1-p)$
for the following decoding schemes:

1. ML decoding [7]: Since this decoding scheme selects the code word with the lowest weight, an error occurs when
the $n_v^*$ collectively flipped bits have a lower weight than the original ones. When the $n_v^*$ collectively flipped bits
have an equal weight to the original ones, we declare an error with probability $\frac{1}{2}$, such that one immediately
obtains

2. MPM decoding: This decoding scheme selects the code word that maximizes the marginal posterior,
and minimizes the bit error rate (or, in a statistical physics framework, that minimizes the free energy at the
Nishimori temperature [22]).

Effectively this attributes a posterior probability $\exp(\beta F w(n))/Z$ to each codeword $n$, where $\beta F = \ln\left(\frac{p}{1-p}\right)$,
and where $Z = \sum'_{n} \exp(\beta F w(n))$, with $\sum'_{n}$ being the sum over all code words. Since we assume that we
are in the decodable region, we have that $Z \simeq \exp(\beta F w(n_0)) + \exp(\beta F w(n_f))$, with $n_f$ being $n_0$ with $n_v^*$
bits flipped. Hence, by selecting the solution with the maximal marginal posterior probability we obtain that
$P_{MPM}(e|n_v^*) = P_{ML}(e|n_v^*)$, as given for ML decoding.
3. Typical set decoding (TS): This decoding scheme randomly selects a code word from the typical set. We
declare an error when a noise different from $n_0$ is selected. Hence the error probability is given by $(n_{ts} - 1)/n_{ts}$, where
$n_{ts}$ is the number of code words in the typical set. Since we are in the decodable region, for $n_v^* \sim O(1)$, the
original and the flipped code word are both (and the only) codewords in the typical set, such that

$$P_{TS}(e|n_v^*) = \frac{1}{2}. \qquad \text{(B1)}$$

Note that TS decoding has an inferior performance for $P_B$ compared to ML and MPM decoding.
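
The verbal rule for ML decoding can be turned into a short calculation. The sketch below is a hedged illustration rather than a reproduction of the closed-form expression: it assumes that each of the $n_v^*$ bits is hit by channel noise independently with probability $p$, so that if $i$ of them are noisy the flipped configuration has weight $n_v^* - i$ on those bits. The function name `p_ml_error` is hypothetical.

```python
from math import comb


def p_ml_error(n_star, p):
    """Error probability of ML decoding given n_star collectively flippable bits.

    i counts how many of the n_star bits carry noise on a BSC with flip rate p.
    An error occurs if the flipped configuration is lighter (n_star - i < i),
    and with probability 1/2 if both weights are equal.
    """
    total = 0.0
    for i in range(n_star + 1):
        prob = comb(n_star, i) * p ** i * (1 - p) ** (n_star - i)
        if n_star - i < i:
            total += prob
        elif n_star - i == i:
            total += 0.5 * prob
    return total


for n_star in (1, 2, 3, 4):
    print(n_star, p_ml_error(n_star, 0.1))
```

For small $p$ the result is dominated by the lowest power of $p$ that allows at least half of the $n_v^*$ bits to be noisy, so larger flippable groups are exponentially less harmful once they occur.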



